On Robust Information Extraction from High- Dimensional Data
نویسنده
چکیده
Information extraction from high-dimensional data represents an important problem in current applications in management or econometrics. An important problem from a practical point of view is the sensitivity of machine learning methods with respect to the presence of outlying data values, while numerical stability represents another important aspect of data mining from high-dimensional data. This paper gives an overview of various types of data mining, discusses their suitability for high-dimensional data and critically discusses their properties from the robustness point of view, while we explain that the robustness itself is perceived differently in different contexts.Moreover, we investigate properties of a robust nonlinear regression estimator of Kalina (2013).
منابع مشابه
Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...
متن کاملHyperspectral Image Classification Based on the Fusion of the Features Generated by Sparse Representation Methods, Linear and Non-linear Transformations
The ability of recording the high resolution spectral signature of earth surface would be the most important feature of hyperspectral sensors. On the other hand, classification of hyperspectral imagery is known as one of the methods to extracting information from these remote sensing data sources. Despite the high potential of hyperspectral images in the information content point of view, there...
متن کاملSupervised Feature Extraction of Face Images for Improvement of Recognition Accuracy
Dimensionality reduction methods transform or select a low dimensional feature space to efficiently represent the original high dimensional feature space of data. Feature reduction techniques are an important step in many pattern recognition problems in different fields especially in analyzing of high dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...
متن کاملData Extraction using Content-Based Handles
In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. In our extraction algorithm, we inspired by the way a human user scans the page content for specific data. In particular, we use text fea...
متن کاملDimension Reduction by Mutual Information Feature Extraction
During the past decades, to study high-dimensional data in a large variety of problems, researchers have proposed many Feature Extraction algorithms. One of the most effective approaches for optimal feature extraction is based on mutual information (MI). However it is not always easy to get an accurate estimation for high dimensional MI. In terms of MI, the optimal feature extraction is creatin...
متن کامل